A lightweight optimization selection method for Sparse Matrix-Vector Multiplication
Authors
Abstract
In this paper, we propose an optimization selection methodology for the ubiquitous sparse matrix-vector multiplication (SpMV) kernel. We present two models that attempt to identify the major performance bottleneck of the kernel for every instance of the problem and then select an appropriate optimization to tackle it. Our first model requires online profiling of the input matrix in order to detect its most prevalent performance issue, while our second model uses only comprehensive structural features of the sparse matrix. Thanks to its application and architecture awareness, our method delivers stable, high SpMV performance across different platforms and sparse matrices. Our experimental results demonstrate that a) our approach is able to distinguish and appropriately optimize special matrices on multicore platforms that fall outside the standard class of memory-bandwidth-bound matrices, and b) it leads to a significant performance gain of 29% on a manycore platform compared to an architecture-centric optimization, as a result of successfully selecting the appropriate optimization for the great majority of the matrices. With a runtime overhead equivalent to a couple of dozen SpMV iterations, our approach is practical for use in the iterative numerical solvers of real-life applications.
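To make the idea concrete, the sketch below shows what a purely structure-based selection step could look like for a matrix stored in CSR format. The feature set, thresholds, and optimization labels are illustrative assumptions for this example, not the classifiers proposed in the paper.

```c
/* Illustrative sketch of structure-based optimization selection for SpMV.
 * Feature set, thresholds and optimization labels are hypothetical
 * placeholders, not the models described in the abstract above. */
#include <math.h>

typedef struct {            /* CSR storage of an m x n sparse matrix */
    int m, n, nnz;
    const int *row_ptr;     /* length m + 1 */
    const int *col_idx;     /* length nnz   */
    const double *val;      /* length nnz   */
} csr_t;

typedef enum { OPT_BASELINE, OPT_COMPRESS, OPT_BALANCE, OPT_REORDER } spmv_opt_t;

static spmv_opt_t select_optimization(const csr_t *A)
{
    /* assumes m > 0 and nnz > 0 */
    double mean = (double)A->nnz / A->m;        /* avg nonzeros per row    */
    double var = 0.0;
    for (int i = 0; i < A->m; i++) {
        double d = (A->row_ptr[i + 1] - A->row_ptr[i]) - mean;
        var += d * d;
    }
    double cv = sqrt(var / A->m) / mean;        /* row-length irregularity */
    double density = (double)A->nnz / ((double)A->m * A->n);

    if (cv > 1.0)       return OPT_BALANCE;     /* highly irregular rows   */
    if (density < 1e-5) return OPT_REORDER;     /* scattered accesses to x */
    if (mean < 8.0)     return OPT_COMPRESS;    /* index data dominates    */
    return OPT_BASELINE;                        /* plain CSR SpMV          */
}
```

An online-profiling variant, in the spirit of the first model described above, would replace these static structural features with measurements gathered from a few benchmark SpMV runs on the target machine.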
Similar references
Breaking the performance bottleneck of sparse matrix-vector multiplication on SIMD processors
The low utilization of SIMD units and memory bandwidth is the main performance bottleneck on SIMD processors for sparse matrix-vector multiplication (SpMV), which is one of the most important kernels in many scientific and engineering applications. This paper proposes a hybrid optimization method to break the performance bottleneck of SpMV on SIMD processors. The method includes a new sparse ma...
Run-Time Optimization of Sparse Matrix-Vector Multiplication on SIMD Machines
Sparse matrix-vector multiplication forms the heart of iterative linear solvers used widely in scientific computations (e.g., finite element methods). In such solvers, the matrix-vector product is computed repeatedly, often thousands of times, with updated values of the vector until convergence is achieved. In an SIMD architecture, each processor has to fetch the updated off-processor vector el...
Optimizing Sparse Matrix Vector Multiplication on SMPs
We describe optimizations of sparse matrix-vector multiplication on uniprocessors and SMPs. The optimization techniques include register blocking, cache blocking, and matrix reordering. We focus on optimizations that improve performance on SMPs, in particular, matrix reordering implemented using two different graph algorithms. We present a performance study of this algorithmic kernel, showing ho...
Optimizing Sparse Matrix Computations for Register Reuse in SPARSITY
Sparse matrix-vector multiplication is an important computational kernel that tends to perform poorly on modern processors, largely because of its high ratio of memory operations to arithmetic operations. Optimizing this algorithm is difficult, both because of the complexity of memory systems and because the performance is highly dependent on the nonzero structure of the matrix. The Sparsity sy...
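As background for the register-reuse optimizations discussed in this entry, the following is a minimal sketch of a 2x2 register-blocked (BCSR) SpMV kernel. The data-structure and function names are hypothetical, and the fixed 2x2 block size is chosen only for brevity; SPARSITY-style systems tune the block size per matrix and machine.

```c
/* Minimal 2x2 register-blocked (BCSR) SpMV sketch: y += A * x.
 * Struct and field names are illustrative, not SPARSITY's implementation. */
typedef struct {
    int mb;                 /* number of block rows (rows / 2)       */
    const int *brow_ptr;    /* length mb + 1                         */
    const int *bcol_idx;    /* block column indices                  */
    const double *bval;     /* 2x2 blocks stored row-major, 4 each   */
} bcsr22_t;

static void spmv_bcsr22(const bcsr22_t *A, const double *x, double *y)
{
    for (int bi = 0; bi < A->mb; bi++) {
        double y0 = 0.0, y1 = 0.0;              /* accumulators kept in registers */
        for (int k = A->brow_ptr[bi]; k < A->brow_ptr[bi + 1]; k++) {
            const double *b = &A->bval[4 * k];
            int j = 2 * A->bcol_idx[k];
            double x0 = x[j], x1 = x[j + 1];    /* x entries reused across the block */
            y0 += b[0] * x0 + b[1] * x1;
            y1 += b[2] * x0 + b[3] * x1;
        }
        y[2 * bi]     += y0;
        y[2 * bi + 1] += y1;
    }
}
```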
Journal title: CoRR
Volume: abs/1511.02494
Issue: -
Pages: -
Publication date: 2015